Title: Anomaly Synthesis and Detection in Accounting Data via a Generative Adversarial Network

Description:
This repository contains the source code and experimental setup for the paper "Anomaly Synthesis and Detection in Accounting Data via a Generative Adversarial Network".
The implementation generates synthetic anomalous accounting records and detects irregular patterns with a generative adversarial network (GAN). The code supports reproducible experiments, modular configuration, and scalable data loading for both supervised and semi-supervised tasks.

Dataset Information:
- Source: The dataset consists of anonymized accounting transaction logs collected from publicly available financial datasets and internal synthetic records.
- Structure: Each sample contains time-indexed transactional features (amount, frequency, account type, vendor ID, timestamp, and category labels).
- Split:
  Training set: 70%
  Validation set: 15%
  Test set: 15%
- Preprocessing: Missing values are imputed with forward-fill; numerical features are normalized to the [0, 1] range; categorical attributes are one-hot encoded.
- Note: All datasets are anonymized and comply with institutional data protection standards.
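
The preprocessing and split steps listed above can be sketched in plain Python. This is a minimal illustration, not the contents of data/preprocess.py: the function names are hypothetical, normalization is assumed to be min-max scaling, and the 70/15/15 split is assumed to be chronological (the source does not say how samples are assigned to each set).

```python
from typing import Optional, Sequence


def forward_fill(values: Sequence[Optional[float]]) -> list[Optional[float]]:
    """Replace each missing value (None) with the most recent observed value."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)  # stays None if nothing has been observed yet
    return filled


def min_max_scale(values: Sequence[float]) -> list[float]:
    """Linearly rescale values to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels: Sequence[str]) -> tuple[list[str], list[list[int]]]:
    """Return the sorted category names and one indicator row per label."""
    categories = sorted(set(labels))
    index = {c: i for i, c in enumerate(categories)}
    rows = [[int(index[label] == i) for i in range(len(categories))]
            for label in labels]
    return categories, rows


def split_indices(n: int, train: float = 0.70, val: float = 0.15):
    """Assumed chronological split: first 70% train, next 15% val, rest test."""
    n_train = round(n * train)
    n_val = round(n * val)
    idx = list(range(n))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```

For example, `forward_fill([None, 1.0, None])` leaves the leading gap as None because there is no earlier value to carry forward; a real pipeline would need a policy for such rows.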

Code Information:
Project structure:
data/
  preprocess.py          # Data cleaning and normalization
  dataloader.py          # Batch loader with augmentation
model/
  generator.py           # GAN generator network
  discriminator.py       # GAN discriminator
  trainer.py             # Adversarial training loop
utils/
  metrics.py             # Evaluation metrics (Accuracy, Precision, Recall, F1, AUC)
  visualization.py       # Loss and feature visualization
main.py                  # Main script to execute experiments
config.yaml              # Model and training configuration
requirements.txt         # Environment dependencies

Usage Instructions:
1. Clone the repository:
   git clone https://github.com/username/anomaly-GAN-accounting.git
   cd anomaly-GAN-accounting
2. Create environment and install dependencies:
   conda create -n anomalyGAN python=3.10
   conda activate anomalyGAN
   pip install -r requirements.txt
3. Prepare dataset:
   Place the raw dataset under data/raw/, then run:
   python data/preprocess.py --input data/raw --output data/processed
4. Train the model:
   python main.py --config config.yaml
5. Evaluate the model:
   python main.py --mode test --weights checkpoints/best_model.pth
6. Visualize results:
   python utils/visualization.py --log_dir runs/
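
The commands in steps 4 and 5 imply a small command-line interface for main.py. The sketch below shows one way such a parser could look with argparse. Only the flag names --config, --mode, and --weights and the value "test" come from the commands above; the "train" default, the default config path, and the rule that test mode requires weights are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Train or evaluate the anomaly GAN")
    parser.add_argument("--config", default="config.yaml",
                        help="path to the YAML configuration file")
    # Assumption: training is the default because step 4 passes no --mode.
    parser.add_argument("--mode", choices=["train", "test"], default="train",
                        help="run adversarial training or evaluate a checkpoint")
    parser.add_argument("--weights", default=None,
                        help="checkpoint to load when --mode test is used")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: evaluation is meaningless without a trained checkpoint.
    if args.mode == "test" and args.weights is None:
        parser.error("--mode test requires --weights")
    return args
```

With this sketch, `parse_args(["--mode", "test", "--weights", "checkpoints/best_model.pth"])` yields a namespace whose `config` falls back to "config.yaml".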

Requirements:
- Operating System: Ubuntu 22.04 LTS or Windows 11
- Python Version: 3.10+
- Core Libraries:
  PyTorch 2.3.1
  torchvision 0.18.1
  pandas 2.2.2
  NumPy 1.26.4
  scikit-learn 1.5.1
  Matplotlib 3.8.4
  tqdm 4.66+
- Hardware:
  GPU: NVIDIA RTX 4090 (24 GB VRAM)
  CPU: Intel i9-13900K or equivalent
  Memory: ≥ 32 GB
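
To confirm that an environment matches the pinned library versions above, a short check along the following lines may help. It is a sketch that uses only the standard library; the helper names are hypothetical, the dictionary keys are the PyPI distribution names, and tqdm is left out because the list gives a minimum version for it rather than an exact pin.

```python
from importlib import metadata

# Exact pins copied from the Core Libraries list.
PINNED = {
    "torch": "2.3.1",
    "torchvision": "0.18.1",
    "pandas": "2.2.2",
    "numpy": "1.26.4",
    "scikit-learn": "1.5.1",
    "matplotlib": "3.8.4",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(pinned=PINNED):
    """Return {name: (expected, found)} for each missing or mismatched package."""
    problems = {}
    for name, expected in pinned.items():
        found = installed_version(name)
        # Ignore local build suffixes such as "+cu121" on GPU wheels.
        if found is None or found.split("+")[0] != expected:
            problems[name] = (expected, found)
    return problems
```

An empty dictionary from `check_environment()` means every pinned package is installed at the listed version.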

Methodology:
1. Data Processing:
   Accounting data are normalized and structured into temporal windows. Each transaction sequence is represented as a multi-dimensional vector capturing statistical and categorical attributes.
2. Model Architecture:
   The GAN framework includes a generator (G) that synthesizes anomalous transaction patterns and a discriminator (D) that distinguishes real from synthetic samples.
   Objective:
   min_G max_D V(D, G) = E_{x ~ p_data}[log D(x)] + E_{z ~ p_z}[log(1 - D(G(z)))]
   where x is a real sample and z is a noise vector drawn from the prior p_z.
   The objective is optimized with Adam (lr=0.0002, β1=0.5, β2=0.999).
3. Training Setup:
   Batch size: 128
   Epochs: 200
   Loss function: Adversarial + Reconstruction + Regularization
   Metrics: Accuracy, F1-score, Precision, Recall, AUC
4. Evaluation:
   The model is compared against Autoencoder, Isolation Forest, and VAE baselines.
   Results are averaged over three runs with fixed random seeds for reproducibility.
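
As a worked illustration of the min-max objective in step 2, the sketch below estimates E[log D(x)] + E[log (1 - D(G(z)))] from batches of discriminator outputs. It is plain Python for readability and is not the repository's training code; the function name and the sample values are hypothetical.

```python
import math


def value_function(d_real, d_fake):
    """Batch estimate of E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs D(x) on real samples, each in (0, 1).
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# A discriminator that cannot tell real from synthetic outputs 0.5 everywhere,
# which gives 2 * log(0.5) = -log(4).
confused = value_function([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])

# A confident, correct discriminator pushes the value toward its upper bound of 0.
confident = value_function([0.99, 0.98], [0.01, 0.02])
```

The discriminator tries to raise this value and the generator tries to lower it, so a value near -log(4) indicates that synthetic samples have become hard to distinguish from real ones.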

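The threshold-based metrics listed under Training Setup can be computed from binary predictions as sketched below. The function name is hypothetical, the convention that label 1 marks an anomaly is an assumption, and AUC is omitted because it needs continuous scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for 0/1 labels (1 = anomaly, assumed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Because anomalies are typically rare, accuracy alone can look high for a model that flags nothing, which is one reason to report precision, recall, and F1 alongside it.
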
Citations:
If you use this code or dataset, please cite:
[Author(s)]. (2025). "Anomaly Synthesis and Detection in Accounting Data via a Generative Adversarial Network." [Journal Name], [Volume(Issue)], [Page Range]. DOI: [to be assigned].

License & Contribution Guidelines:
This repository is released under the MIT License.
Contributions and bug reports are welcome. Please submit pull requests or open issues following the standard GitHub workflow.
